MPP: A Matlab Preprocessor ver 05-May-2023

Synopsis

Preprocessors are part of most modern languages for good reasons; however, Matlab does not come with one. MPP was designed to fill this void. An example is given to show how MPP can make a Matlab GUI application more readable and easier to modify. Execution speed is also often improved. An important feature of MPP is that line numbers are preserved. Thus, when Matlab reports an error at line n, you can correct the error by going to line n in the preprocessor source file. This MPP subfolder (under plt) includes the following files:

MPP.CPP	Source code for MPP. Compatible with most C and C++ compilers. Before using MPP you must compile this source to create an executable program. However, on a Windows system you can skip this step by using the next file listed below.
MPP.b64	Executable for Windows/DOS (compiled with the Borland C++ 5.5 compiler using the command bcc32 MPP.CPP). This file is in base64 format because binary files are not allowed on the File Exchange. To convert this file to the binary file MPP.exe, type the following commands: >> cd(fileparts(which('plt'))) >> cd MPP >> b64decode MPP.b64 .exe Or, if you prefer, you could use an online converter such as base64decode.org. If the file has been decoded properly, the resulting exe file will be 65536 bytes. If you are using a different operating system, you must create the executable by compiling the MPP.CPP source file. Verify that it is compiled correctly by comparing the output .m files with the ones supplied for the 3 examples below. (I would appreciate it if you shared your executable file with me so I can share it with other users.)
examp0.mi examp0.m	Simple example 0. The .m file is the output of running MPP on the .mi source file. This example (both the .mi and .m forms) is also shown in section B of this document.
examp1.mi examp1.m	Demonstrates conditional compilation.
modelA.m modelB.m modelC.m	Used by examp1.mi.
sum5s.mi sum5s.m	A GUI example program demonstrating many features of MPP.

Motivation

In a large Matlab programming project, hundreds of constants may be needed to represent the various constraints and conditions of the problem to be solved as well as the myriad choices made by the programmer. If all these constants are sprinkled throughout the code, it quickly becomes unreadable and difficult to modify. The solution to this problem is to define variable names for each required constant at the beginning of the code (or in one or more separate .m script files).

For example, in your code, you may define a variable such as:

slider_width = .2;

Besides making the program more intelligible, this definition makes it trivial to change the slider width if that is ever required. This is nothing new. It's the commonly accepted style in any programming language. In fact, most languages provide a preprocessor to make it easy and efficient to define literals to represent constants (e.g. the #define statement in C).

Note that a definition such as the slider_width constant mentioned above does not require a preprocessor, and most good Matlab programmers already use many definitions like it. However, if you have several functions in your .m file, you would have to declare the variable as global to use it in all the functions defined in the file. Nothing prevents one of the functions from modifying the variable, which may be confusing if you originally intended it to be a constant. Recent versions of Matlab allow the definition of constant class properties, but few Matlab programmers understand object-oriented programming, and using such constructs prevents you from distributing your program to users of older Matlab versions. In addition, since a new variable is defined for each constant, there is a time penalty due to Matlab's interpretive environment. In reality, this penalty is usually insignificant, but this concern and the global issues mentioned above often encourage an unreadable and difficult-to-maintain style with constants liberally scattered throughout the .m code. Another problem is that Matlab provides no easy way to define enumerated types. The MPP preprocessor was designed to solve these problems.

It is conceivable that you could use a preprocessor designed for C or other languages on your Matlab source. However, MPP is far more suitable for this task than other preprocessors because of these two MPP features:

The number of lines in MPP's output file is the same as the number of lines in its input file. This is important so that when Matlab reports an error line number, it is easy to go to the line at fault in your .mi source code.
Most of the MPP preprocessor constructs are designed in such a way that the source code can be run by Matlab whether or not the source was passed through MPP. (Not all of the constructs fit that model however.)

Calling sequence: MPP FileIn [FileOut] [-s]

where:

FileIn	Full path and file name for the MPP input (required argument). If the extension is omitted, .mi will be assumed. If no path is given, the local directory is assumed.
FileOut	Full path and file name for MPP output (optional). If the extension is omitted, .m will be assumed. If this argument is not included, then MPP will write its output to a file with the same name as FileIn except that the extension is changed to .m. If the input file already has .m as its extension, then MPP will output its results to MPP.M.
-s	This option tells MPP to strip the comments from the output file. This switch may be placed anywhere after MPP on the command line. If one of the lines in the input file contains a comment and nothing else, then a line containing the single character % is written to the output file (thus maintaining the correspondence between input and output line numbers). Stripping comments saves disk space and may increase loading and execution speed as well. The comments of course are not lost, since the .mi file is the true source. NOTE: The comment lines at the top of the file are not removed since they are needed to provide the help text when you type help func_name from the command window.
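
The default file names described above can be summarized in a short sketch. This is Python written for illustration only, not MPP's actual code (MPP itself is the C++ program in MPP.CPP). The function name resolve_names is hypothetical, the extension test is assumed to be case-insensitive, and since the text does not say in which directory MPP.M is written, the sketch simply returns the bare name.

```python
from pathlib import PurePath

def resolve_names(file_in, file_out=None):
    """Return (input_name, output_name) using the defaults described above."""
    src = PurePath(file_in)
    if src.suffix == "":
        src = src.with_suffix(".mi")      # missing extension: assume .mi
    if file_out is not None:
        dst = PurePath(file_out)
        if dst.suffix == "":
            dst = dst.with_suffix(".m")   # missing extension: assume .m
    elif src.suffix.lower() == ".m":      # assumption: case-insensitive test
        dst = PurePath("MPP.M")           # input is already a .m file
    else:
        dst = src.with_suffix(".m")       # same name, extension changed to .m
    return str(src), str(dst)

print(resolve_names("examp0"))
```

With these rules, the command MPP examp0 reads examp0.mi and writes examp0.m, which is the case used in section B.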

MPP makes a single pass scan through input file FileIn (except with the -s switch, which uses an additional scan) and passes all characters to output file FileOut, unmodified except in the following four situations:

1. Include Files

When the string %include is found, an include section is begun. This string may be anywhere on a line, but it must be exact. (% include or %Include will not initiate an include section.) The form of an include section is as follows:


     %include
       include1;  % comments are allowed here
       include2;  % This refers to an include file named "include2.m"
       includeN;
     %end_include

To find the include file, MPP will first look for the specified file (with a .m extension) in the current directory. If the file is not found, MPP will look for a list of include directories in a file called MPP.INI in the local directory. If MPP.INI is not in the local directory, MPP will try to find it in the directory that contains MPP.EXE. Once MPP.INI is found, MPP reads the first line of that file, which should contain a path string such as: "c:\Matlab\projX\wcom".

MPP will search the specified directory for the include file. If it doesn't find it, the second line of MPP.INI will be read. If the end of MPP.INI is reached before finding the include file, an error message is reported.
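
The search order just described can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not MPP's implementation: the function name find_include and its parameters are hypothetical, each line of MPP.INI is assumed to hold one directory, and surrounding double quotes on a line are assumed to be optional.

```python
from pathlib import Path

def find_include(name, exe_dir, cwd="."):
    """Locate <name>.m using the search order described above."""
    cwd = Path(cwd)
    candidate = cwd / f"{name}.m"
    if candidate.is_file():               # 1. the current directory
        return candidate
    ini = cwd / "MPP.INI"                 # 2. MPP.INI in the local directory
    if not ini.is_file():
        ini = Path(exe_dir) / "MPP.INI"   # 3. MPP.INI next to the executable
    if ini.is_file():
        for line in ini.read_text().splitlines():
            folder = line.strip().strip('"')   # assumption: quotes optional
            if not folder:
                continue
            candidate = Path(folder) / f"{name}.m"
            if candidate.is_file():       # first listed directory that has it
                return candidate
    raise FileNotFoundError(f"include file {name}.m not found")
```

A file in the current directory always wins, so a project can override a shared include file simply by placing a copy next to the .mi source.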

Include files may contain only three types of lines:

Comment lines
Blank lines
Lines of the form token = ReplacementText

The last form will be treated in the same manner as lines within a define section as described in section 2 below. The replacement text may span multiple lines by using the Matlab line continuation symbol (three consecutive dots).

The %include and %end_include lines are passed to the output file, but the include file names in between are not. If the first line of the include file is a comment (which is a good practice), then that line will be inserted into the output file in place of the include file name (this preserves line numbers). Otherwise, the include file name is replaced with a line of the form: % IncludeFilename.m;.
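
The line-preserving rule for include file names can be sketched in a few lines. This is a hypothetical Python helper (include_placeholder is not a name from MPP) that returns the single output line standing in for one include file name.

```python
def include_placeholder(include_name, include_text):
    """Return the one output line that replaces an include file name."""
    lines = include_text.splitlines()
    if lines and lines[0].lstrip().startswith("%"):
        return lines[0]                   # reuse the file's leading comment
    return f"% {include_name}.m;"         # otherwise emit a generated comment

print(include_placeholder("include2", "width = .2;\n"))   # prints "% include2.m;"
```

Either way exactly one comment line is written per include file name, so the output keeps the same number of lines as the input.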

2. Literal assignments

A define section is begun when an include file is opened for input or when the string %define is found. An example of a define section follows:


%define 
   literal1 = RepTxt1;                % comments are allowed here too
   literal2 = RepTxt2;
   literal3 = RepTxt3;
   literal4 = literal1(1,a,literal2);
   literal5 = 10 * literal2 + literal3;
   c6 = 'abc';  c7 = 'def';            % multiple assignments per line are allowed
%end_define

What is a literal?

A literal can be any contiguous string of alphanumeric and underscore (_) characters that doesn't begin with a digit.
Also, the $ character is allowed, but only as either the first or the last character of the literal (see section C).
Finally, the ( character is allowed in a literal, but only as the last character. This is to allow function definitions (see the last four lines of the define section of examp0 in section B).

Defining the replacement text:

The replacement text for the literal consists of all the characters after the equal sign until the end of the line, excluding the white space after the = and at the end of the line.
Comments are also not included in the replacement text.
The replacement text also does not include the semicolon used to terminate the replacement text (and actually, that semicolon is not required).
The replacement text may include any character except % and ;, although ; is allowed if it's used as a separator between elements of an array. If you need these characters as replacement text in other circumstances you may use the predefined literals mentioned below in section C.
As shown in the last line of the above example, you can make several assignments on one line by separating each with a semicolon.

If the replacement text contains a literal that is previously defined within a define section, then that literal is replaced with its previously defined replacement text. For example, the replacement text assigned to literal4 above would be:

RepTxt1(1,a,RepTxt2)

In a limited set of situations, MPP will also do arithmetic simplifications. For example, if RepTxt2 and RepTxt3 were both numbers, then the replacement text assigned to literal5 would be reduced to a single number. MPP will simplify the replacement text if the expression after the equal sign consists only of these five operations:

Addition	+
Subtraction	-
Multiplication	*
Division	/
Association	( )

Also, all the operands must be numbers, or literals which have previously been assigned to numbers (or expressions that have been reduced to numbers). If any of these conditions are not met, then MPP will simply associate the literal with the unmodified replacement text.
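
The simplification rule can be illustrated with a small evaluator. This is a Python sketch under stated assumptions, not MPP's algorithm: it borrows Python's own expression parser, the name simplify is hypothetical, a leading minus sign is accepted (as in the definition of i in examp0), and the 13-significant-digit output format is only inferred from the numbers shown in the examp0.m listing in section B.

```python
import ast
import operator

_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def simplify(expr, numbers=None):
    """Return expr reduced to a number string, or None if it does not qualify."""
    numbers = numbers or {}               # literals already reduced to numbers

    def value(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.Name) and node.id in numbers:
            return numbers[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](value(node.left), value(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](value(node.operand))
        raise ValueError("not a pure arithmetic expression")

    try:
        result = value(ast.parse(expr.strip(), mode="eval").body)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None                       # caller keeps the unmodified text
    return "%.13g" % result               # assumption: 13 significant digits

print(simplify("10 * literal2 + literal3", {"literal2": 2, "literal3": 3}))  # prints 23
print(simplify("RepTxt1(1,a,RepTxt2)"))                                      # prints None
```

Returning None for anything outside the five operations mirrors the fallback described above: the literal is then associated with the unmodified replacement text.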

If the define section was part of the original MPP input file (i.e. it was not in an include file), then while MPP is saving the literal assignments it also passes the define section through to the output file, modified by inserting an extra % character before each assignment. Thus, for the example above, the following text would be output:


%define
  %literal1 = RepTxt1;                 % comments are allowed here too
  %literal2 = RepTxt2;
  %literal3 = RepTxt3;
  %literal4 = literal1(1,a,literal2);
  %literal5 = 10 * literal2 + literal3;
  %c6 = 'abc'; %c7 = 'def';            % multiple assignments per line are allowed
%end_define

When MPP opens an include file, it treats the contents as a define section in a similar way to that described above, except that no text is passed on to the output file while the literal assignments are being made. This behavior ensures that the number of lines in FileIn will be the same as the number of lines created in FileOut. This is important because of Matlab's use of line numbers for locating errors.

3. Literal substitutions

When MPP encounters a literal in the input stream that has been previously defined in a define section, MPP writes the replacement text to the output file instead of the original literal. There are three exceptions to this rule:

MPP will not replace the literal with its replacement text if the literal is within a comment. (i.e. on a line following a %).
MPP will not replace literals that are within a quoted string (a group of characters beginning and ending with a ' character). Actually, that definition of a quoted string is overly simplistic, since Matlab uses two quotes to represent a quote character within a quoted string. More accurately, a quoted string is a group of characters that begins with a ' and ends with an odd number of consecutive ' characters. For example, the following is a quoted string: 'set(gcf,''WindowButtonUpFcn'','''');'
You can prevent a literal replacement by following the literal with two underscore characters.
For example: aa__ = func1__(arg66__);
will be translated to: aa = func1(arg66);
That translation remains the same even if any of these literals (aa, func1, arg66, aa__, func1__, arg66__) have been defined.
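
The three exceptions can be combined into a single left-to-right scan of each line. The following Python sketch is illustrative only and makes several simplifying assumptions: substitute_line is a hypothetical name, literals ending in ( or containing $ are ignored, and every ' is treated as a string delimiter, so Matlab's transpose operator is not handled.

```python
import re

NUMBER = re.compile(r"(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def substitute_line(line, table):
    """Replace defined literals in one line, honoring the three exceptions."""
    out, i, n = [], 0, len(line)
    while i < n:
        ch = line[i]
        if ch == "%":                     # exception 1: rest of line is a comment
            out.append(line[i:])
            break
        if ch == "'":                     # exception 2: copy a quoted string as is
            j = i + 1
            while j < n:
                if line[j] == "'":
                    if j + 1 < n and line[j + 1] == "'":
                        j += 2            # doubled quote stays inside the string
                        continue
                    j += 1                # lone quote closes the string
                    break
                j += 1
            out.append(line[i:j])
            i = j
            continue
        m = NUMBER.match(line, i) or IDENT.match(line, i)
        if m is None:                     # operator, blank, bracket, etc.
            out.append(ch)
            i += 1
            continue
        word = m.group(0)
        if IDENT.fullmatch(word) and len(word) > 2 and word.endswith("__"):
            word = word[:-2]              # exception 3: drop the marker, no lookup
        else:
            word = table.get(word, word)  # numbers such as 1e20 pass through
        out.append(word)
        i = m.end()
    return "".join(out)

print(substitute_line("y = 1e20 * e; s = 'e''e'; % e", {"e": "2.7"}))
```

Only the standalone e is replaced in this call. Matching numbers before identifiers keeps the e inside 1e20 from being mistaken for the literal e, and the copies inside the quoted string and the comment are left alone.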

4. Assignment operator simplifications

Even outside of a define section, when MPP encounters text of the form:

   literal = expression;

where expression meets all the conditions specified above for the define sections, then MPP will replace the expression with a single number. For example, if FileIn contains the string:

   if val == 0 xy = 2/5; else xy = 1+5/2;

Then the following text will be written to FileOut:

   if val == 0 xy = 0.4; else xy = 3.5;

Note that this is the only case where expressions are simplified outside of define sections. Thus the line:

   if val == 0 xy = sqrt(1+5/2);

would not be simplified. (A possible extension to MPP?)

A. MPP's handling of the vector operator "[ ]"

Since the vector operator is not listed among the five expression operators, you would expect a left or right bracket to prevent any expression containing it from being evaluated. While this is true in general, there are a few special cases that are so useful that MPP was made to recognize this operator in those situations. The first of these special cases allows MPP to simplify expressions that are elements of a vector within a define section. For example:


%define
   vec1 = [3, 2*2, 5+1];
%end_define

In this case, the replacement text for literal vec1 would be: [3, 4, 6]

The delimiters allowed between the elements of vectors defined within a define section are the comma (for row vectors) and the semicolon (for column vectors). In Matlab the two vectors

vec1 = [3, 2*2, 5+1];
vec1 = [3 2*2 5+1];

are equivalent. (i.e. the commas are not required). However, when defining a vector within a define section, the commas are required for MPP to recognize the replacement text as a vector. And likewise, the semicolon is a required delimiter for a column vector within a define section. Note that MPP will not simplify vectors occurring outside of a define section.

The second of these special cases allows MPP to extract an element of a previously defined vector. Thus if vec1 is defined as above, then vec1(2) will be replaced with the number 4. This replacement takes place wherever vec1(2) occurs, both inside and outside of define sections.

Although you may define literals that contain nested brackets, MPP will not be able to simplify expressions within such definitions. Nested brackets also interfere with MPP's ability to extract a single element from the vector. If you do this anyway, you may get unintended results. For example, suppose one of the lines of a define section is:

a = [3, 4, [5, 6]];

Then later if you refer to a(3), MPP will substitute [5, 6].

That may not be what you intended, since in Matlab [3, 4, [5, 6]] is a different way of writing the 1 by 4 vector: [3, 4, 5, 6]
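
The element extraction described in this section, including the nested-bracket surprise, can be reproduced with a short sketch. This is illustrative Python, not MPP's code: the names split_elements and element are hypothetical, and delimiters inside quoted strings or parentheses are not treated specially.

```python
def split_elements(vector_text):
    """Split '[a, b; c]' at top-level commas and semicolons."""
    text = vector_text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected a bracketed vector")
    elements, current, depth = [], [], 0
    for ch in text[1:-1]:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch in ",;" and depth == 0:     # delimiter outside nested brackets
            elements.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    elements.append("".join(current).strip())
    return elements

def element(vector_text, index):
    """Return element number index, counting from 1 as Matlab does."""
    return split_elements(vector_text)[index - 1]

print(element("[3, 4, 6]", 2))        # prints 4
print(element("[3, 4, [5, 6]]", 3))   # prints [5, 6]
```

Because the nested vector is kept as one element, a(3) yields [5, 6] rather than the third element of the flattened Matlab vector, which is 5.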

The following example (section B) will help you learn how MPP treats vectors.

B. An example: examp0.mi

Suppose the file examp0.mi contains the following:


%define
  pi = 3.1415926;    e = 2.7182818284590;
  big = 1e20 * pi * e;
  a = 3;
  car = 9999;
  i = -((2.0 - 1e4 + car) * a + 1) * 10 * 10 / 2;
  b = [2*3+a, 4*5, 6*7+10, car];
  x = [10.1, 10.2, 0.00001*a, 1e4];     % note that the leading 0 is needed
  v = [b(2), x(3), b(2)*x(3)];
  labels = ['Label01'; 'Label02'; 'Label03'; 'Label04'];
  bx = b + x;  % this will work, but the next line will be much more efficient
  bxx = [b(1)+x(1), b(2)+x(2), b(3)+x(3), b(4)+x(4)]; % see expansion of this
  EMPTY = isempty;                % synonym for isempty function
  NONE( = isempty(;               % another synonym for isempty function
  PHASE( = (180/pi) * angle(;     % phase of complex argument in degrees
  InputGain( = siglab('InpGain',; % synonym for siglab input gain call
%end_define

%  yy = big;  ii = i;
   yy = big;  ii = i;

%  bbb = b;   bbb = b(4) * b(1);
   bbb = b;   bbb = b(4) * b(1);

%  xx = x(1) + aaa + x(3) + x(2);   xx = x(1) + x(3) + x(2);
   xx = x(1) + aaa + x(3) + x(2);   xx = x(1) + x(3) + x(2);

%  x4 = v * v(3);
   x4 = v * v(3);

%  ebx = bx;
   ebx = bx;
   
%  ebxx = bxx;
   ebxx = bxx;

%  ee = 2e3 + 2.0e-2 * 3.e2;
   ee = 2e3 + 2.0e-2 * 3.e2;

%  xlabel([labels(2),labels(4)]);
   xlabel([labels(2),labels(4)]);

%  if q1 == q2  ex1 = 3/4; else ex1 = 4/3; end;
   if q1 == q2  ex1 = 3/4; else ex1 = 4/3; end;

%  if EMPTY(wg) wg = 1;   if NONE(wg) wg = 1;
   if EMPTY(wg) wg = 1;   if NONE(wg) wg = 1;

%  deg = PHASE(xx + yy*j);   InputGain(Chan,Level);
   deg = PHASE(xx + yy*j);   InputGain(Chan,Level);

Then after typing the DOS command MPP examp0, the file examp0.m will be written as follows:


%define
  %pi = 3.1415926;    %e = 2.7182818284590;
  %big = 1e20 * pi * e;
  %a = 3;
  %car = 9999;
  %i = -((2.0 - 1e4 + car) * a + 1) * 10 * 10 / 2;
  %b = [2*3+a, 4*5, 6*7+10, car];
  %x = [10.1, 10.2, 0.00001*a, 1e4];     % note that the leading 0 is needed
  %v = [b(2), x(3), b(2)*x(3)];
  %labels = ['Label01'; 'Label02'; 'Label03'; 'Label04'];
  %bx = b + x;  % this will work, but the next line will be much more efficient
  %bxx = [b(1)+x(1), b(2)+x(2), b(3)+x(3), b(4)+x(4)]; % see expansion of this
  %EMPTY = isempty;                % synonym for isempty function
  %NONE( = isempty(;               % another synonym for isempty function
  %PHASE( = (180/pi) * angle(;     % phase of complex argument in degrees
  %InputGain( = siglab('InpGain',; % synonym for siglab input gain call
%end_define

%  yy = big;  ii = i;
   yy = 8.539734077001e+20;  ii = -200;

%  bbb = b;   bbb = b(4) * b(1);
   bbb = [9,20,52,9999];   bbb = 89991;

%  xx = x(1) + aaa + x(3) + x(2);   xx = x(1) + x(3) + x(2);
   xx = 10.1 + aaa + 3e-05 + 10.2;   xx = 20.30003;

%  x4 = v * v(3);
   x4 = [20,3e-05,0.0006] * 0.0006;

%  ebx = bx;
   ebx = [9,20,52,9999] + [10.1,10.2,3e-05,10000];
   
%  ebxx = bxx;
   ebxx = [19.1,30.2,52.00003,19999];

%  ee = 2e3 + 2.0e-2 * 3.e2;
   ee = 2006;

%  xlabel([labels(2),labels(4)]);
   xlabel([ 'Label02', 'Label04']);

%  if q1 == q2  ex1 = 3/4; else ex1 = 4/3; end;
   if q1 == q2  ex1 = 0.75; else ex1 = 1.333333333333; end;

%  if EMPTY(wg) wg = 1;   if NONE(wg) wg = 1;
   if isempty(wg) wg = 1;   if isempty(wg) wg = 1;

%  deg = PHASE(xx + yy*j);   InputGain(Chan,Level);
   deg = (180/3.1415926) * angle(xx + yy*j);   siglab('InpGain',Chan,Level);

C. Pre-defined literals and their use

These four literals are pre-defined (MPP syntax would prevent you from defining them using a define section):

Literal	Value	Use
c$	%	comment character
s$	;	statement terminator
$$	'	quote character
FileName$		MPP's input file (without extension)

The comment character can be used to selectively comment out debugging code, or even as a way of conditional compilation. For example:

  %define
    IF_UNIX = c$;  % comment these lines out when compiling for Unix
    IF_DOS  = ;    % comment these lines out when compiling for Unix
  % IF_UNIX = ;    % comment these lines out when compiling for DOS
  % IF_DOS  = c$;  % comment these lines out when compiling for DOS
  %end_define
  IF_UNIX [s,w] = unix('rm dir1/dir2/file');
  IF_DOS  eval('!del dir1\dir2\file.exe');

The example above shows what the code would look like when coding for a Windows environment. The IF_UNIX literal is defined as a comment character, thus commenting out the line with the call to unix, and the IF_DOS literal is defined as a null string, so that the line with the eval is executed.

Replacing Matlab if else clauses with these conditionals can make your program smaller and faster. Another example of conditional compilation can be found in the included examp1.mi example.

While debugging a program you may want to tell Matlab to display a specific set of variables. Usually, this is done by removing the semicolon at the end of each expression that you want in your debug printout. The drawback of this method is that once you no longer need the debug printout, it may be difficult to find all the missing semicolons. Furthermore, once you have put these semicolons back in, if you later want to re-enable the debug printout, you have no record of which semicolons to remove again. You can avoid this cumbersome problem by using MPP to assign a set of expressions to a named debug group:

  %define
    %Debug1 = s$;
     Debug1 = ;
  %end_define;

  a = foobar(barfoo(j.*k)) Debug1
  b = func1(a^2);
  c = func2(sqrt(a)+1) Debug1

Essentially what we have done is to assign the variables a and c to a debug variable group called Debug1. The define section shown above enables this debug group by assigning the Debug1 variable to a space. Later you can disable the debug group by replacing the define section with:

  %define
    Debug1 = s$;
    %Debug1 = ;
  %end_define;

Now Debug1 has been assigned to a semicolon instead of the space we used previously. Notice that a simple edit (moving the % character from one line to another) is all that is required to enable or disable the debugging group. Of course, each debug group can include as many variables as you want and you can have as many different debug groups as you need.

Since a semicolon terminates a literal assignment within a define section the replacement text can't normally include a semicolon character. However, there is a way around this if you really do want to include a semicolon in the replacement text. Simply use the s$ semicolon symbol wherever you want the semicolon included within the assignment text. Unlike the semicolon, the s$ doesn't terminate the replacement text.

As noted previously, MPP will not normally substitute a literal with its replacement text if the literal occurs within a quoted string. Such behavior would be dangerous and would destroy your ability to define arbitrary strings as needed. However, sometimes it is useful to allow MPP to make such replacements. To allow replacements inside a quoted string, simply use $$ in place of the quote character. MPP will replace the $$ with a single quote, as well as make any replacements for all literals found within the quotes. You can also use $$ to define replacement text that has an odd number of quotes (although that is rarely useful). As explained previously, the $ character may be used in a literal only as the first or last character of the literal. The use of this special property of $ can be seen in the following somewhat extreme example (but if you can understand this one, everything else is easy):

  %define
    File$      = foobar;
    Extension$ = .ini;
    InputFile  = $$$$File$Extension$$$$$;  % file name, double quoted
    Permission = ''r+'';                   % open for reading & writing
    Action3    = 'Callback',$$fid = fopen(InputFile,Permission)$$;
  %end_define
  set(Handle1,Action3);

The output of MPP corresponding to this last line is:

set(Handle1,'Callback','fid = fopen(''foobar.ini'',''r+'')');

With less experience, we might have tried this simpler way of defining InputFile (the same as above, but without the trailing $ in File$ and Extension$):

  %define
    File      = foobar;
    Extension = .ini;
    InputFile = $$$$FileExtension$$$$;  % file name, double quoted
  %end_define

But this would not work. InputFile would no longer contain foobar or .ini, because MPP would look up the literal FileExtension, which of course has not been defined, so no substitution would occur.

Another special case that applies to define sections (or include files) is that you can specify a literal with no replacement text. In that case, MPP will assign a value to the literal equal to one plus the value of the previously defined numeric literal. The syntax of this is best described by example. Handle definitions are usually made sequentially and can be created using a define section such as:

  %define
    FIG    = 1;         % begin assigning handle definitions
    RUN    = FIG+1;     % run button
    STOP   = RUN+1;     % stop button
    SLIDEV = STOP+1;    % voltage slider
    AXIS1  = SLIDEV+1;  % axis 1
    AXIS2  = AXIS1+1;   % axis 2
    AXIS3  = AXIS2+1;   % axis 3
    PLOT1  = AXIS3+1;   % plot 1
    PLOT2  = PLOT1+1;   % plot 2
    PLOT3  = PLOT2+1;   % plot 3
  %end_define

Definitions such as this are referred to as enumerated types, and you could also use a classdef construct for this. However, that would make the code incompatible with older versions of Matlab (before the object-oriented constructs were added). Another advantage of using MPP for this purpose is that, by using the special case mentioned above, you could replace the above definition with the following more concise and readable one:

  %define
    FIG = 1;             % begin assigning handle definitions
    RUN; STOP;           % run, stop buttons
    SLIDEV;              % voltage slider
    AXIS1; AXIS2; AXIS3; % axis 1,2,3
    PLOT1; PLOT2; PLOT3; % plot 1,2,3
  %end_define
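
The auto-increment rule can be checked with a short sketch that assigns the same numbers as the explicit FIG+1, RUN+1 form. This is illustrative Python rather than MPP's implementation: assign_enumerated is a hypothetical name, only integer values are handled, and comments are assumed to start at the first % on a line.

```python
def assign_enumerated(define_body):
    """Map each literal to its number; a bare literal gets previous + 1."""
    table = {}
    previous = None
    for raw_line in define_body.splitlines():
        code = raw_line.split("%", 1)[0]  # drop the trailing comment
        for entry in code.split(";"):
            entry = entry.strip()
            if not entry:
                continue
            if "=" in entry:              # explicit value, e.g. FIG = 1
                name, text = (part.strip() for part in entry.split("=", 1))
                previous = int(text)      # assumption: integer values only
            else:                         # bare literal: one plus the previous
                if previous is None:
                    raise ValueError(f"{entry} has no preceding numeric literal")
                name = entry
                previous += 1
            table[name] = previous
    return table

body = """FIG = 1;             % begin assigning handle definitions
RUN; STOP;           % run, stop buttons
SLIDEV;              % voltage slider
AXIS1; AXIS2; AXIS3; % axis 1,2,3
PLOT1; PLOT2; PLOT3; % plot 1,2,3"""
print(assign_enumerated(body)["PLOT3"])   # prints 10
```

Inserting a new literal in the middle of the list renumbers everything after it automatically, which is the main maintenance advantage over the explicit form.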

Caution: If you use auto-incrementing defines (as above), any of the pre-defined literals, or any literal containing the $ character, then the preprocessor source cannot be run by Matlab directly. Therefore, once you use any of these more advanced MPP features, you are committed to using MPP on your source code. Even without these special features, many literal assignments that are legal and useful for MPP are illegal when used as input to Matlab directly. Once you resign yourself to always running your source through MPP, these problems become moot, and you will discover many ways to use MPP to make your code clearer and easier to modify.

Some editors allow you to automatically "compile" the source using MPP before exiting. That will make it less likely to accidentally skip the MPP process before running the .m file.

You may also want to invoke MPP from a make file, which might look something like this:

   .SUFFIXES:
   .SUFFIXES: .m .mi
   .mi.m:
           mpp $*.mi
   OBJS = program1.m program2.m
   all: $(OBJS)
   program1.m: program1.inc
   program2.m: program2.inc

You can learn some of these tricks for using MPP to make your code cleaner from the following example.

D. sum5s.m (a GUI example)

The previous examples merely give you an idea of the mechanics of what the preprocessor does, but this program (sum5s.mi) is an example that shows how to use MPP to achieve the benefits that I mentioned in the synopsis.

Before studying the code, run it first to see what it does. To do this, you must run the preprocessor to create sum5s.m by typing "mpp sum5s". Then run it by typing "sum5s" at the Matlab command prompt.

You will then see a figure like the one here except that the amplitude of the periodic function will be slowly varying. This movement will continue until you click on the stop button.

The goal of this example is merely to showcase ways of using MPP, so it does not use any of the pseudo objects or plotting techniques of the plt package. Only standard Matlab objects are used in this particular example. (You don't even need to have plt installed to run the sum5s program.)

Note that in this GUI example, I have chosen to use some preprocessor constructs (e.g. the definitions of the GUI objects & properties) which Matlab cannot handle directly. Therefore you cannot run sum5s without first running the source code through MPP.